13 research outputs found

    Likelihood-Maximizing-Based Multiband Spectral Subtraction for Robust Speech Recognition

    Automatic speech recognition performance degrades significantly when speech is affected by environmental noise. The major challenge today is to achieve good robustness in adverse noisy conditions so that automatic speech recognizers can be used in real situations. Spectral subtraction (SS) is a well-known and effective approach; it was originally designed to improve the quality of the speech signal as judged by human listeners. SS techniques usually improve the quality and intelligibility of the speech signal, whereas speech recognition systems need compensation techniques that reduce the mismatch between noisy speech features and acoustic models trained on clean speech. Nevertheless, a correlation can be expected between speech quality improvement and the increase in recognition accuracy. This paper proposes a novel approach to this problem that treats SS and the speech recognizer not as two independent entities cascaded together, but as two interconnected components of a single system sharing the common goal of improved speech recognition accuracy. The approach feeds information from the statistical models of the recognition engine back to tune the SS parameters. With this architecture, we overcome the drawbacks of previously proposed methods and achieve better recognition accuracy. Experimental evaluations show that the proposed method achieves significant improvement in recognition rates across a wide range of signal-to-noise ratios.
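For readers unfamiliar with the baseline technique, the following is a minimal sketch of classic single-band power spectral subtraction, not the likelihood-maximizing multiband method the abstract proposes. The function name, the over-subtraction factor `alpha`, and the spectral floor `beta` are illustrative assumptions; the input is assumed to be already split into windowed frames.

```python
import numpy as np

def spectral_subtract(noisy_frames, noise_frames, alpha=2.0, beta=0.01):
    """Classic power spectral subtraction on pre-framed audio.

    noisy_frames, noise_frames: arrays of shape (n_frames, frame_len).
    alpha (over-subtraction) and beta (spectral floor) are hand-set here;
    the abstract above describes tuning SS parameters from recognizer feedback.
    """
    noisy_spec = np.fft.rfft(noisy_frames, axis=1)
    noisy_power = np.abs(noisy_spec) ** 2

    # Noise power estimate: average over frames assumed to contain noise only.
    noise_power = np.mean(np.abs(np.fft.rfft(noise_frames, axis=1)) ** 2, axis=0)

    # Subtract the scaled noise estimate, then apply a floor so that
    # no bin becomes negative.
    clean_power = np.maximum(noisy_power - alpha * noise_power, beta * noisy_power)

    # Reuse the noisy phase, as is conventional for spectral subtraction.
    clean_spec = np.sqrt(clean_power) * np.exp(1j * np.angle(noisy_spec))
    return np.fft.irfft(clean_spec, n=noisy_frames.shape[1], axis=1)
```

Because every frequency bin is only ever attenuated, the output frames never carry more energy than the input frames.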

    Clustering based on Mixtures of Sparse Gaussian Processes

    Creating low-dimensional representations of a high-dimensional data set is an important component of many machine learning applications. How to cluster data using their low-dimensional embedded space is still a challenging problem in machine learning. In this article, we propose a joint formulation for both clustering and dimensionality reduction. When a probabilistic model is desired, one possible solution is to use mixture models in which both the cluster indicator and the low-dimensional space are learned. Our algorithm is based on a mixture of sparse Gaussian processes and is called Sparse Gaussian Process Mixture Clustering (SGP-MIC). Our approach has three main advantages over existing methods: its probabilistic nature offers benefits over deterministic methods, non-linear generalizations of the model are straightforward to construct, and the sparse model together with an efficient variational EM approximation helps to speed up the algorithm.

    Two-Dimensional Heteroscedastic Feature Extraction Technique for Face Recognition

    One limitation of vector-based LDA and its matrix-based extension is that they cannot deal with heteroscedastic data. In this paper, we present a novel two-dimensional feature extraction technique for face recognition that is capable of handling heteroscedastic data. The technique is a general form of two-dimensional linear discriminant analysis. It generalizes the interclass scatter matrix of two-dimensional LDA by applying the Chernoff distance as a measure of separation between every pair of clusters with the same index in different classes. By employing this distance, our method can capture the discriminatory information present in the difference between the covariance matrices of different clusters while preserving the computational simplicity of eigenvalue-based techniques. Our approach is therefore well suited to high-dimensional applications such as face recognition. Experimental results on the CMU-PIE, AR and AT&T face databases demonstrate the effectiveness of our method in terms of classification accuracy.
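To illustrate why the Chernoff distance suits heteroscedastic data, here is a small sketch of the standard Chernoff distance between two Gaussian distributions. It is the textbook vector-form definition, not the two-dimensional scatter-matrix generalization described in the abstract above; the function name and the default `s=0.5` (which yields the Bhattacharyya distance) are illustrative choices.

```python
import numpy as np

def chernoff_distance(m1, S1, m2, S2, s=0.5):
    """Chernoff distance between N(m1, S1) and N(m2, S2), with 0 < s < 1."""
    m1, m2 = np.asarray(m1, dtype=float), np.asarray(m2, dtype=float)
    S1, S2 = np.asarray(S1, dtype=float), np.asarray(S2, dtype=float)

    S = s * S1 + (1.0 - s) * S2  # blended covariance
    d = m2 - m1

    # Term driven by the difference in means (the only part LDA-style
    # criteria with a shared covariance can see).
    mean_term = 0.5 * s * (1.0 - s) * d @ np.linalg.solve(S, d)

    # Term driven by the difference in covariances; it vanishes when S1 == S2.
    _, logdet_S = np.linalg.slogdet(S)
    _, logdet_1 = np.linalg.slogdet(S1)
    _, logdet_2 = np.linalg.slogdet(S2)
    cov_term = 0.5 * (logdet_S - s * logdet_1 - (1.0 - s) * logdet_2)

    return mean_term + cov_term
```

Two clusters with identical means but different covariances still receive a positive distance through the covariance term, which is the information a homoscedastic criterion discards.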

    Improvement in detection of presence in forbidden locations in video anomaly using optical flow map

    Anomaly detection has long been a subject of research. Its use cases range from quality control in production lines to providing security in public places. One of the most attractive applications of anomaly detection is in video surveillance systems. In this paper, we propose a method based on frame prediction and optical flow to improve anomaly detection in videos. The optical flow of normal frames provides information about the amount of movement in different regions of the frames, which helps the system better detect the entrance of people or objects into forbidden areas. Based on the optical flow of normal videos and that of the current video, the threshold for the anomaly decision is adaptively adjusted. This could ultimately lead to better overall performance of the anomaly detection system compared with recent similar work. The presented method is general and can easily be incorporated into other video anomaly detection systems to improve detection accuracy.

    Deep Graph Clustering via Mutual Information Maximization and Mixture Model

    Attributed graph clustering, or community detection, which learns to cluster the nodes of a graph, is a challenging task in graph analysis. In this paper, we introduce a contrastive learning framework for learning clustering-friendly node embeddings. Although graph contrastive learning has shown outstanding performance in self-supervised graph learning, its use for graph clustering is not well explored. We propose Gaussian mixture information maximization (GMIM), which uses a mutual information maximization approach for node embedding while assuming that the representation space follows a Mixture of Gaussians (MoG) distribution. The clustering part of our objective tries to fit a Gaussian distribution to each community. The node embeddings are jointly optimized with the parameters of the MoG in a unified framework. Experiments on real-world datasets demonstrate the effectiveness of our method in community detection.
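The idea of fitting one Gaussian per community can be sketched as the soft-assignment step of a Mixture of Gaussians over node embeddings. This is only an illustration under stated assumptions (diagonal covariances, fixed embeddings, hypothetical function and parameter names); it does not reproduce the GMIM objective or its joint optimization with the encoder.

```python
import numpy as np

def mog_responsibilities(Z, means, variances, weights):
    """Posterior probability of each mixture component for each embedding.

    Z:         (n, d) node embeddings
    means:     (k, d) component means
    variances: (k, d) diagonal covariances (assumed for simplicity)
    weights:   (k,)   mixing proportions summing to 1
    """
    diff = Z[:, None, :] - means[None, :, :]  # (n, k, d)

    # Log-density of each embedding under each diagonal Gaussian.
    log_pdf = -0.5 * (
        np.sum(diff ** 2 / variances[None, :, :], axis=2)
        + np.sum(np.log(2.0 * np.pi * variances), axis=1)[None, :]
    )
    log_joint = np.log(weights)[None, :] + log_pdf

    # Normalize in log space for numerical stability.
    log_joint -= log_joint.max(axis=1, keepdims=True)
    resp = np.exp(log_joint)
    return resp / resp.sum(axis=1, keepdims=True)
```

Taking the `argmax` over each row of the returned matrix gives a hard community label per node.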